Constructing Near-Perfect Phylogenies with multiple homoplasy events
Abstract
MOTIVATION: We explore the problem of constructing near-perfect phylogenies on bi-allelic haplotypes, where the deviation from perfect phylogeny is entirely due to homoplasy events. We present polynomial-time algorithms for restricted versions of the problem. We show that these algorithms can be extended to genotype data, in which case the problem is called the near-perfect phylogeny haplotyping (NPPH) problem. We present a near-optimal algorithm for the H1-NPPH problem, which is to determine whether a given set of genotypes admits a phylogeny with a single homoplasy event. The time complexity of our algorithm for the H1-NPPH problem is O(m^2(n + m)), where n is the number of genotypes and m is the number of SNP sites. This is a significant improvement over the earlier O(n^4) algorithm. We also introduce generalized versions of the problem. The H(1, q)-NPPH problem is to determine whether a given set of genotypes admits a phylogeny with q homoplasy events, all of which occur at a single site. We present an O(m^(q+1)(n + m)) algorithm for the H(1, q)-NPPH problem.
RESULTS: We present results on simulated data, which demonstrate that the accuracy of our algorithm for the H1-NPPH problem is comparable to that of the existing methods, while being orders of magnitude faster.
AVAILABILITY: The implementation of our algorithm for the H1-NPPH problem is available upon request.
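As background for the perfect-phylogeny condition that the near-perfect problem relaxes, the sketch below applies the standard four-gamete test to bi-allelic haplotypes: a 0/1 haplotype matrix admits an unrooted perfect phylogeny exactly when no pair of sites shows all four gametes 00, 01, 10 and 11. This is not the authors' H1-NPPH algorithm, and it works on haplotypes rather than genotypes. The function names and the example data are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def conflicting_site_pairs(haplotypes):
    """Return all site pairs (i, j) that exhibit the four gametes 00, 01, 10, 11.

    `haplotypes` is a list of equal-length sequences of 0/1 alleles
    (one row per haplotype, one column per SNP site). Illustrative helper,
    not taken from the paper.
    """
    if not haplotypes:
        return []
    num_sites = len(haplotypes[0])
    conflicts = []
    for i, j in combinations(range(num_sites), 2):
        # Collect the distinct allele pairs observed at sites i and j.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            conflicts.append((i, j))
    return conflicts


def admits_perfect_phylogeny(haplotypes):
    """True if no pair of sites fails the four-gamete test."""
    return not conflicting_site_pairs(haplotypes)


# Hypothetical example data: four haplotypes over three sites.
perfect = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 0, 1)]
# Adding one haplotype makes sites 0 and 1 show all four gametes.
conflicted = perfect + [(0, 1, 0)]

print(admits_perfect_phylogeny(perfect))
print(conflicting_site_pairs(conflicted))
```

The test runs in O(n m^2) time for n haplotypes and m sites. A non-empty conflict list means the data cannot be explained by a perfect phylogeny, which is the situation where a near-perfect phylogeny with one or more homoplasy events becomes relevant.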
Related articles
FPT Algorithms for Binary Near-Perfect Phylogenetic Trees
We consider the problem of reconstructing near-perfect phylogenetic trees using binary character states (referred to as BNPP). A perfect phylogeny assumes that every character mutates at most once in the evolutionary tree, yielding an algorithm for binary character states that is computationally efficient but not robust to imperfections in real data. A near-perfect phylogeny relaxes the perfect...
Simple Reconstruction of Binary Near-Perfect Phylogenetic Trees
We consider the problem of reconstructing near-perfect phylogenetic trees using binary character states (referred to as BNPP). A perfect phylogeny assumes that every character mutates at most once in the evolutionary tree, yielding an algorithm for binary character states that is computationally efficient but not robust to imperfections in real data. A near-perfect phylogeny relaxes the perfect...
Microsatellites retain phylogenetic signals across genera in eucalypts (Myrtaceae)
The utility of microsatellites (SSRs) in reconstructing phylogenies is largely confined to studies below the genus level, due to the potential of homoplasy resulting from allele size range constraints and poor SSR transferability among divergent taxa. The eucalypt genus Corymbia has been shown to be monophyletic using morphological characters, however, analyses of intergenic spacer sequences ha...
The morphological state space revisited: what do phylogenetic patterns in homoplasy tell us about the number of possible character states?
Biological variety and major evolutionary transitions suggest that the space of possible morphologies may have varied among lineages and through time. However, most models of phylogenetic character evolution assume that the potential state space is finite. Here, I explore what the morphological state space might be like, by analysing trends in homoplasy (repeated derivation of the same characte...
Cocos: Constructing multi-domain protein phylogenies - PLOS Currents Tree of Life
Phylogenies of multi-domain proteins have to incorporate macro-evolutionary events, which dramatically increases the complexity of their construction. We present an application to infer ancestral multi-domain proteins given a species tree and domain phylogenies. As the individual domain phylogenies are often incongruent, we provide diagnostics for the identification and reconciliation of implau...
Journal:
- Bioinformatics
Volume 22, Issue 14
Pages: -
Publication date: 2006